EXECUTIVE SUMMARY (100-200 words):

The  unifying  goal  of  this  project  was  to  characterize  and  optimize  the  interplay  between  network 
topology,  communication  protocols,  and  estimation  performance.  The  first  part  of  the  project 
considered  wireless  sensor  networks  and  utilized  feedback  from  fusion  sinks  to  optimize 
communication  parameters  for  estimation  objectives.  The  second  part  of  the  project  focused  on 
networks  that  employ  linear  network  coding  and  produced  novel  methods  for  network 
tomography in this particular setting. The third part of the project focused on learning of graphs,
including but not limited to communication networks. In all cases, we designed novel network
protocols and estimation methods and showed that they advance the state of the art.

PART I: COMMUNICATION EFFECTS IN WIRELESS SENSOR NETWORKS


Overview:  This  part  of  the  project  uses  feedback  from  network  sinks  (e.g.,  a  fusion  center,  a 
communication  terminal)  to  network  sources  (e.g.,  sensors,  a  multicast  terminal)  to  allocate 
wireless  network  resources  (e.g.,  data  rates,  transmission  gains,  UAV  positions)  in  order  to 
optimize  performance  (e.g.,  connectivity,  estimation  accuracy,  throughput). 

MIMO Ad Hoc Multicast Networks," IEEE Trans. Vehicular Technology, Vol. 61, No. 4,
pp.  1762-1778,  May,  2012. 

F.  Jiang  and  A.  Swindlehurst,  "Optimization  of  UAV  Heading  for  the  Ground-to-Air  Uplink," 
IEEE  J.  of  Sel.  Areas  in  Communications,  Vol.  30,  No.  5,  pp.  993-1005,  June,  2012. 

F.  Jiang,  J.  Chen  and  A.  Swindlehurst,  "Estimation  in  Phase-Shift  and  Forward  Wireless  Sensor 
Networks,"  IEEE  Trans.  Signal  Processing,  Vol.  61,  No.  15,  pp.  3840-3851,  Aug.  2013. 

J.  Chen,  F.  Jiang  and  A.  Swindlehurst,  "The  Gaussian  CEO  Problem  for  Scalar  Sources  with 
Arbitrary Memory," submitted to IEEE Trans. Information Theory, June 2013.

F.  Jiang,  J.  Chen  and  A.  Swindlehurst,  "Optimal  Power  Allocation  for  Parameter  Tracking  in  a 
Distributed  Amplify-and-Forward  Sensor  Network,"  submitted  to  IEEE  Trans.  Signal 
Processing,  August  2013. 

F.  Jiang  and  A.  Swindlehurst,  "Dynamic  UAV  Relay  Positioning  for  the  Ground-to-Air  Uplink," 
In  Proc.  Int'l  Workshop  on  Wireless  Networking  for  Unmanned  Aerial  Vehicles,  pp.  1766-1770, 
Miami,  FL,  December,  2010. 

F.  Jiang,  J.  Chen  and  A.  Swindlehurst,  "Phase-Only  Analog  Encoding  for  a  Multi-Antenna 
Fusion  Center,"  in  Proc.  IEEE  ICASSP,  pp.  2645-2648,  Kyoto,  Japan,  March,  2012. 

J. Chen and A. Swindlehurst, "On the Achievable Sum Rate of Multi-terminal Source Coding for
a  Correlated  Gaussian  Vector  Source,"  in  Proc.  IEEE  ICASSP,  pp.  2665-2668,  Kyoto,  Japan, 
March,  2012. 

J.  Chen,  F.  Jiang  and  A.  Swindlehurst,  "The  Gaussian  CEO  Problem  for  a  Scalar  Source  with 
Memory:  A  Necessary  Condition,"  In  Proc.  46th  Asilomar  Conference  on  Signals,  Systems,  and 
Computers,  pp.  1219-1223,  Pacific  Grove,  CA,  November,  2012. 

F. Jiang, J. Chen and A. Swindlehurst, "Parameter Tracking via Optimal Distributed
Beamforming in an Analog Sensor Network," In Proc. 46th Asilomar Conference on Signals,
Systems,  and  Computers,  pp.  1397-1401,  Pacific  Grove,  CA,  November,  2012. 


F. Jiang, J. Chen and A. Swindlehurst, "Linearly Reconfigurable Kalman Filtering for a Vector
Process,"  in  Proc.  IEEE  ICASSP,  Vancouver,  BC,  Canada,  May,  2013. 

Summary  of  Technical  Results 


Estimation  in  Gaussian  Networks  and  Multiterminal  Source  Coding:  Wireless  sensor  networks 
(WSNs)  are  often  used  for  distributed  sensing,  in  which  geographically  distributed  sensors  make 
measurements  or  local  estimates  and  forward  them  to  a  fusion  center  (FC),  which  conducts 
further  processing  to  extract  useful  information  from  the  data.  In  practice,  the  local 
measurements  are  typically  quantized  prior  to  transmission,  and  there  is  clearly  a  trade-off 
between  the  level  of  quantization  (or  equivalently  the  sensors'  transmission  rates)  and  the  final 
estimation  accuracy.  With  knowledge  of  the  required  accuracy  and  the  statistical  characteristics 
of  the  source  and  noise,  the  fusion  center  can  optimally  determine  the  sensors'  individual 
transmission  rates  and  feed  this  information  back  to  the  sensors  in  order  to  efficiently  use  the 
available  computing  and  communication  resources.  A  block  diagram  of  the  distributed 
estimation/communication  system  is  depicted  in  Figure  1. 
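
As a rough illustration of the rate/accuracy trade-off described above, the following Python sketch uses the standard high-resolution model of a uniform quantizer (error variance equal to the squared step size divided by 12). The function name `fused_mse`, the sample-mean fusion rule, and all parameter values are assumptions made for illustration only; this is not the rate allocation derived in this project.

```python
def fused_mse(num_sensors, noise_var, bits, amplitude):
    """Approximate MSE of a sample-mean estimate at the fusion center.

    Assumptions (illustrative only):
      - every sensor observes the same scalar plus independent noise
        of variance `noise_var`;
      - every sensor quantizes uniformly over [-amplitude, amplitude]
        using `bits` bits;
      - quantization error is uniform and independent of the signal
        (high-resolution model), so its variance is step**2 / 12.
    """
    step = 2.0 * amplitude / (2 ** bits)
    quant_var = step ** 2 / 12.0
    return (noise_var + quant_var) / num_sensors


if __name__ == "__main__":
    sensors = 10
    for bits in (1, 2, 4, 8):
        total_rate = sensors * bits  # bits per sample period, all sensors
        print(total_rate, fused_mse(sensors, 0.25, bits, 2.0))
```

Under these assumptions the error approaches the floor `noise_var / num_sensors` as the per-sensor rate grows, so additional bits eventually buy very little accuracy.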

This  type  of  system  is  equivalent  to  indirect  multiterminal  source  coding,  first  studied  in  [1]  and 
referred  to  as  the  central  estimation  officer  (CEO)  problem.  In  contrast  to  the  direct 
multiterminal  source  coding  problem  [2],  where  sensors  separately  measure  different  but 
correlated  sources  and  the  fusion  center  attempts  to  rebuild  every  source  as  accurately  as  possible 
subject  to  a  sum-rate  constraint,  each  of  the  sensors  in  the  CEO  problem  receives  a  noisy 
observation  of  the  same  source,  which  is  later  reconstructed  at  the  fusion  center.  There  has  been 
a  considerable  body  of  work  published  on  the  Gaussian  CEO  problem,  including  establishment 
of  rate  regions  for  memoryless  sources,  rate-distortion  trade-offs  for  different  types  of  source-to- 
destination  networks,  and  the  development  of  specific  coding  schemes  to  approach  the  derived 
performance  bounds  [3-19]. 

Except for a brief discussion in [20], all previous studies have assumed that the sample sequence
generated  by  the  source  is  memoryless.  In  our  work,  we  have  studied  the  achievable  sum-rate 
problem  for  a  Gaussian  scalar  source  with  arbitrary  memory.  We  have  formulated  the  sum-rate 
calculation  as  a  variational  calculus  problem  with  a  distortion  constraint,  and  we  have  shown 
how  to  find  a  necessary  condition  which  the  solution  to  the  problem  must  satisfy.  Furthermore, 
we  can  provide  a  sufficient  condition  for  determining  if  the  necessary  solution  achieves  the 
minimal sum-rate performance. We demonstrated how to compute the rate-distortion curve, and
we have shown that our solution is compatible with previous findings in rate-distortion theory.
For the special case of a system with two sensor nodes, we have derived an analytic expression
for the solution to the sum-rate problem.



Figure  1  —  Indirect  multiterminal  source  coding,  or  the  "CEO  Problem." 


Estimation  in  Analog  Sensor  Networks  (Static  Parameter) :  Recently,  considerable  research  has 
focused  on  the  fusion  of  analog  rather  than  encoded  digital  data  in  distributed  sensor  networks  to 
improve estimation performance. The advantages of analog WSNs have been established in
[21-23], where it was shown that when using distortion between the source and recovered signal
as the performance metric, digital transmission (separate source and channel coding) achieves an
exponentially worse performance than analog signaling.

A  general  analog  WSN  scenario  is  investigated  in  [24],  involving  vector  observations  of  a 
vector-valued  random  process  at  the  sensors,  and  linearly  precoded  vector  transmissions  from  the 
sensors  to  a  multi-antenna  FC.  Optimal  solutions  for  the  precoders  that  minimize  the  mean- 
squared  error  (MSE)  at  the  FC  are  derived  for  a  coherent  MAC  under  power  and  bandwidth 
constraints. In [25], single-antenna sensors amplify and forward their observations to a
multi-antenna FC, but it is shown that for Rayleigh fading channels, the improvement in estimate
variance is upper bounded by only a factor of two compared to the case of a single-antenna FC.
Subsequent results by the same authors in [26,27] have demonstrated that when the channel
undergoes (zero-mean) Rayleigh fading, there is a limit to the improvement in detection
performance for a multi-antenna FC as well, but when the channel is Rician, performance
improves monotonically with respect to the number of antennas.

Some  prior  research  in  radar  and  communications  has  focused  on  scenarios  where  the 
beamformer  weights  implement  only  a  phase  shift  rather  than  both  a  gain  and  a  phase.  The 
advantage  of  using  phase  shifting  only  is  that  it  simplifies  the  implementation  and  is  easily 
performed  with  analog  hardware.  Phase-shift-only  beamformers  have  most  often  been  applied  to 
receivers that null spatial interference [28,29], but they have also been considered on the transmit
side  for  MISO  wireless  communications  systems  [30].  For  the  distributed  WSN  estimation 
problem,  phase-only  sensor  transmissions  have  been  proposed  in  [31],  where  the  phase  is  a 
scaled  version  of  the  observation  itself.  Phase-only  transmissions  were  also  considered  in  the 
context  of  distributed  detection  in  [26]. 

For  this  project,  we  have  studied  a  distributed  WSN  with  single-antenna  sensors  that  observe  an 
unknown  deterministic  parameter  corrupted  by  noise.  The  low-complexity  sensors  apply  a  phase 
shift  (rather  than  both  a  gain  and  phase)  to  their  observation  and  then  simultaneously  transmit  the 
result  to  a  multi-antenna  FC  over  a  coherent  MAC.  The  FC  determines  the  optimal  value  of  the 
phase  for  each  sensor  in  order  to  minimize  the  ML  estimation  error,  and  then  feeds  this 
information  back  to  the  sensors  so  that  they  can  apply  the  appropriate  phase  shift.  The  estimation 
performance  of  the  phase-optimized  sensor  network  has  been  shown  to  be  considerably 
improved  compared  with  the  non-optimized  case,  and  close  to  that  achieved  by  sensors  that  can 
adjust  both  the  transmit  gain  and  phase.  We  analyzed  the  asymptotic  behavior  of  the  algorithm 
for  a  large  number  of  sensors  and  a  large  number  of  antennas  at  the  FC.  In  addition,  we  analyzed 
the  impact  of  phase  errors  at  the  sensors  due,  for  example,  to  errors  in  the  feedback  channel,  a 
time-varying  main  channel  or  phase-shifter  drift.  We  also  considered  a  sensor  selection 
problem,  and  analyzed  its  asymptotic  behavior  as  well.  Some  particular  findings  of  our  research 
are  highlighted  below. 

•  We  have  derived  two  algorithms  for  determining  the  phase  factors  used  at  each  sensor. 
In  the  first,  we  use  semi-definite  relaxation  to  convert  the  original  problem  to  a 
semidefinite  programming  (SDP)  problem  that  can  be  efficiently  solved  by  interior-point 
methods.  For  the  second  algorithm,  we  apply  the  analytic  constant  modulus  algorithm 
(ACMA) [32], which provides a considerably simpler closed-form solution. Despite the
reduction  in  complexity,  the  performance  of  ACMA  is  shown  via  simulation  to  be  only 
slightly  worse  than  the  SDP  solution,  and  close  to  the  theoretical  lower  bound  on  the 
estimate  variance.  This  is  especially  encouraging  for  networks  with  a  large  number  of 
sensors N, since the SDP complexity is on the order of N^3.5, while that for ACMA is only
on the order of N^2.

• We have separately derived performance scaling laws with respect to the number of
antennas  and  the  number  of  sensors  assuming  non-fading  channels  that  take  path  loss  into 
account. For both cases, we derived conditions that determine whether or not the
presence  of  multiple  antennas  at  the  FC  provides  a  significant  benefit  to  the  estimation 
performance.  Prior  work  in  [25-27]  has  focused  on  either  AWGN  channels  with  identical 
channel  gains,  or  on  fading  channels  where  the  channel  gains  are  identically  distributed, 
corresponding  to  the  case  where  the  distances  from  the  sensors  to  the  FC  are  roughly  the 
same.  References  [25-27]  also  assume  a  special  case  where  the  noise  at  each  of  the 
sensors  has  the  same  variance,  although  [27]  examines  how  certain  upper  bounds  on 
performance  change  when  the  sensor  noise  is  arbitrarily  correlated. 

• Using our model for the non-fading case, we are able to elucidate detailed conditions
under  which  the  asymptotic  estimation  performance  will  improve  with  the  addition  of 
more  antennas  M  at  the  FC.  While  [25,26]  showed  that  performance  always  improves 
with  increasing  M  for  AWGN  channels  with  identical  gains  and  identically  distributed 
sensor  noise,  we  derive  more  detailed  conditions  that  take  into  account  the  possibility  of 
non-uniform distances between the sensors and FC and non-uniform noise at the sensors.
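
The role of the fed-back phases can be illustrated with a deliberately simplified special case: a single-antenna FC, unit-modulus (phase-only) sensor weights, and equal sensor noise variance. In that case the variance of the estimate y/g, with g the sum of the phase-rotated channel gains, is minimized by co-phasing the channels. The Python sketch below is only this special case; it is not the SDP or ACMA algorithm developed in the project, and the function names and noise parameters are hypothetical.

```python
import cmath


def estimate_variance(channels, phases, var_v, var_n):
    """Variance of the estimate y / g for a single-antenna fusion center.

    Assumed model: y = sum_i h_i * exp(j*phi_i) * (theta + v_i) + n,
    with independent sensor noise (variance var_v) and FC noise
    (variance var_n). g is the sum of the rotated channel gains.
    """
    weights = [h * cmath.exp(1j * p) for h, p in zip(channels, phases)]
    g = sum(weights)
    noise = var_v * sum(abs(w) ** 2 for w in weights) + var_n
    return noise / abs(g) ** 2


def cophasing_phases(channels):
    """Phase shifts that align all channel gains on the positive real axis."""
    return [-cmath.phase(h) for h in channels]


if __name__ == "__main__":
    h = [1 + 1j, -2 + 0.5j, 0.3 - 1j]
    no_opt = estimate_variance(h, [0.0] * len(h), 0.1, 0.5)
    opt = estimate_variance(h, cophasing_phases(h), 0.1, 0.5)
    print(no_opt, opt)
```

Because the weights have unit modulus, the noise term in the numerator does not depend on the phases, so the only effect of the optimization in this special case is to maximize the magnitude of g. With a multi-antenna FC the problem no longer decouples this way, which is why the relaxation and ACMA approaches described above are needed.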

Estimation  in  Analog  Sensor  Networks  (Dynamic  Parameter) :  As  described  above,  most  prior 
work  on  estimation  in  distributed  amplify-and-forward  sensor  networks  has  focused  on  the 
situation  where  the  parameter(s)  of  interest  are  time-invariant,  and  either  deterministic  or 
i.i.d. Gaussian. An exception is the recent work by Leong et al., who model the (scalar)
parameter  of  interest  using  a  dynamic  Gauss-Markov  process  and  assume  the  FC  employs  a 
Kalman  filter  to  track  the  parameter  [33,34].  In  [33],  both  the  orthogonal  and  coherent  MAC 
were  considered  and  two  kinds  of  optimization  problems  were  formulated:  MSE  minimization 
under a global sum transmit power constraint, and sum power minimization under
an  MSE  constraint.  An  asymptotic  expression  for  the  MSE  outage  probability  was  also  derived 
assuming  a  large  number  of  sensor  nodes.  The  problem  of  minimizing  the  MSE  outage 
probability  for  the  orthogonal  MAC  with  a  sum  power  constraint  was  studied  separately  in  [34]. 

Our  work  has  focused  on  the  coherent  MAC  case  assuming  a  dynamic  parameter  that  is  tracked 
via  a  Kalman  filter  at  the  FC.  As  detailed  in  the  list  of  contributions  below,  we  have  extended  the 
work  of  [33]  for  the  case  of  a  global  sum  power  constraint,  and  we  go  beyond  [33]  to  study 
problems where either the power of the individual sensors is constrained, or the goal is to
minimize  the  peak  power  consumption  of  individual  sensors: 

•  We  derived  a  closed-form  expression  for  the  optimal  complex  transmission  gains  that 
minimize  the  MSE  under  a  constraint  on  the  sum  power  of  all  sensor  transmissions. 
While  this  problem  was  also  solved  in  [33]  using  the  KKT  conditions  derived  in  [24],  our 
approach results in a simpler and more direct solution. We also examine the asymptotic
form  of  the  solution  for  high  total  transmit  power  or  high  noise  power  at  the  FC. 

• We derived a closed-form expression for the optimal complex transmission gain that
minimizes  the  sum  power  under  a  constraint  on  the  MSE.  In  this  case,  the  expression 
depends  on  the  eigenvector  of  a  particular  matrix.  Again,  while  this  problem  was  also 
addressed  in  [33],  the  numerical  solution  therein  is  less  direct  than  the  one  we  obtain.  In 
addition,  we  found  an  asymptotic  expression  for  the  sum  transmit  power  for  a  large 
number  of  sensors. 

•  We  have  shown  how  to  find  the  optimal  transmission  gains  that  minimize  the  MSE  under 
individual  sensor  power  constraints  by  relaxing  the  problem  to  an  SDP,  and  then  proving 
that  the  optimal  solution  can  be  constructed  from  the  SDP  solution. 

•  We  have  shown  how  to  find  the  optimal  transmission  gains  that  minimize  the  maximum 
individual  power  over  all  of  the  sensors  under  a  constraint  on  the  maximum  MSE.  Again, 
we  solved  the  problem  using  SDP,  and  then  proved  that  the  optimal  solution  can  be 
constructed  from  the  SDP  solution. 

•  For  the  special  case  where  the  sensor  nodes  use  equal  power  transmission,  we  derived  an 
exact  expression  for  the  MSE  outage  probability. 
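
To make the dependence of the tracking MSE on the sensor transmission gains concrete, the sketch below implements the textbook scalar Kalman MSE recursion for a Gauss-Markov parameter observed over a coherent MAC. It assumes real-valued channels and gains and a common sensor noise variance; the function names and numerical values are hypothetical, and the gains here are simply given rather than optimized as in the contributions listed above.

```python
def kalman_mse_update(p_prev, alpha, var_u, gains, channels, var_v, var_n):
    """One step of the scalar Kalman MSE recursion at the fusion center.

    Assumed model (real-valued for simplicity):
      theta[k] = alpha * theta[k-1] + u[k],           var(u)   = var_u
      y[k]     = sum_i h_i a_i (theta[k] + v_i) + n,  var(v_i) = var_v,
                                                      var(n)   = var_n
    """
    effective = [h * a for h, a in zip(channels, gains)]
    g = sum(effective)                                  # equivalent scalar gain
    r = var_v * sum(e * e for e in effective) + var_n   # equivalent noise variance
    p_pred = alpha * alpha * p_prev + var_u             # prediction MSE
    return p_pred * r / (g * g * p_pred + r)            # posterior MSE


def steady_state_mse(alpha, var_u, gains, channels, var_v, var_n, iters=200):
    """Iterate the recursion from an arbitrary start until it settles."""
    p = var_u
    for _ in range(iters):
        p = kalman_mse_update(p, alpha, var_u, gains, channels, var_v, var_n)
    return p


if __name__ == "__main__":
    h = [1.0, 0.5, 0.8]
    print(steady_state_mse(0.9, 0.1, [0.1] * 3, h, 0.2, 0.5))
    print(steady_state_mse(0.9, 0.1, [1.0] * 3, h, 0.2, 0.5))
```

In this model the gains enter the MSE only through the ratio of the equivalent noise variance to the squared equivalent gain, which is the quantity that a power-constrained design trades against transmit power.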

UAV Positioning for  Communications:  In  military  or  disaster  response  (e.g.,  fire  fighting) 
scenarios,  users  on  the  ground  require  reliable  communications  with  each  other  and  their 
command  center.  Such  scenarios  often  occur  in  environments  without  a  fixed  communications 
infrastructure  (e.g.,  a  centralized  basestation  as  in  cellular  networks),  and  thus  the  network  must 
operate  in  a  peer-to-peer  or  ad  hoc  manner.  The  users  and  the  command  center  may  be  separated 
by  distances  greater  than  the  range  of  their  communication  devices,  or  the  signals  may  be 
shadowed  due  to  mountainous  terrain  or  dense  surroundings  (forests,  buildings,  etc.). 
Furthermore,  since  the  users  are  mobile,  the  communications  environment  is  constantly  changing 
and  thus  connectivity  is  often  only  sporadic.  Unmanned  aerial  vehicles  (UAVs)  acting  as 
airborne  relays  (essentially  “flying  basestations”)  provide  an  attractive  solution  to  problems 
encountered  in  such  scenarios  since  their  altitude  allows  them  to  get  above  the  ground-based 
shadowing  and  obtain  line-of-sight  (LOS)  or  near  LOS  communication  channels  over  a  large 
area.  Also  and  perhaps  most  importantly,  the  inherent  mobility  of  UAVs  allows  their  position  to 
be  adjusted  in  order  to  best  accommodate  the  evolving  network  topology.  We  have  considered 
such  an  application  under  this  project,  assuming  a  system  with  a  multi-antenna  UAV  flying  over 
a  collection  of  single-antenna  mobile  ground  nodes.  The  UAV  acts  as  a  relay,  collecting  the 
messages  from  the  co-channel  users  on  the  ground  in  order  to  forward  them  to  other  ground- 
based  users  or  some  remote  base  station.  The  goal  is  to  show  how  to  control  the  motion 
of the UAV so as to optimize the uplink communications performance.


In  particular,  in  our  work  we  have  investigated  the  problem  of  positioning  a  multiple  antenna 
UAV  for  enhanced  uplink  communications  from  multiple  ground-based  users.  We  studied  the 
optimal  UAV  trajectory  for  a  case  involving  two  static  users,  and  derived  an  approximate 
method  for  finding  this  trajectory  that  only  requires  a  simple  line  search.  For  the  case  of  a 
network  of  mobile  ground  users,  we  developed  an  adaptive  heading  algorithm  that  uses 
predictions of the user terminal positions and beamforming at the UAV to maximize SINR at
each  time  step.  Two  kinds  of  optimization  problems  were  considered,  one  that  maximizes  an 
approximation  to  the  average  uplink  sum  rate  and  one  that  guarantees  fairness  among  the  users 
using  the  proportional  fair  method.  Our  simulation  studies  indicate  the  effectiveness  of  the 
algorithms  in  automatically  generating  a  suitable  UAV  heading  for  the  uplink  network,  and 
demonstrate  the  benefit  of  using  space-division  multiple  access  (SDMA)  over  time-division 
multiple  access  (TDMA)  in  achieving  the  best  throughput  performance.  We  also  derived 
approximate solutions to the UAV heading problem for low- and high-SNR scenarios; the
approximations  allow  for  a  closed-form  solution  instead  of  a  line  search,  but  still  provide  near- 
optimal  performance  in  their  respective  domains. 
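
The heading line search can be sketched as follows. This Python example is a simplified stand-in: it replaces the beamforming-based SINR objective with a path-loss-only sum-rate proxy, treats the users as static over one time step, and searches a uniform grid of headings within an assumed maximum turn angle. All names and parameters (`snr_ref`, `max_turn`, `steps`, and so on) are hypothetical.

```python
import math


def sum_rate_proxy(uav_xy, users, altitude, snr_ref):
    """Sum of log2(1 + SNR) with SNR falling off as 1 / distance**2.

    Interference and beamforming are ignored; snr_ref is the assumed
    SNR at unit distance.
    """
    total = 0.0
    for ux, uy in users:
        dist_sq = (uav_xy[0] - ux) ** 2 + (uav_xy[1] - uy) ** 2 + altitude ** 2
        total += math.log2(1.0 + snr_ref / dist_sq)
    return total


def best_heading(uav_xy, heading, users, speed, dt, altitude, snr_ref,
                 max_turn, steps=61):
    """Grid line search over headings within +/- max_turn of the current one."""
    def score(candidate):
        nxt = (uav_xy[0] + speed * dt * math.cos(candidate),
               uav_xy[1] + speed * dt * math.sin(candidate))
        return sum_rate_proxy(nxt, users, altitude, snr_ref)

    candidates = [heading + max_turn * (2.0 * k / (steps - 1) - 1.0)
                  for k in range(steps)]
    return max(candidates, key=score)


if __name__ == "__main__":
    users = [(1000.0, 200.0), (800.0, -300.0)]
    print(best_heading((0.0, 0.0), 0.0, users, 50.0, 1.0, 300.0, 1e6,
                       math.pi / 6))
```

Replacing `sum_rate_proxy` with a sum of logarithms of per-user rates would give a proportional-fair variant of the same search.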

PART II: NETWORK CODING AND INFERENCE

Overview: When a communication network employs linear network coding at intermediate
nodes,  it  essentially  acts  as  a  linear  system  whose  transfer  function  depends  primarily  on  the 
network  topology  and  secondarily  on  the  network  coding  coefficients.  In  this  part  of  the  project, 
we  exploited  this  intimate  relation  between  network  coding  and  network  topology  for  inference 
problems.  In  particular,  we  revisited  network  tomography  and  we  designed  novel  active  probing 
and  estimation  techniques. 
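
The "network as a linear system" view can be illustrated with a toy example over GF(2), in which every intermediate node XORs whatever it receives and forwards the result on all outgoing links. The sketch below propagates coding vectors through a directed acyclic graph and shows that two different topologies produce different vectors at the receivers. The node names, the all-ones coding coefficients, and the function name are assumptions for illustration; the probing schemes developed in the project are more general.

```python
def coding_vectors(edges, sources, order):
    """Return the GF(2) coding vector observed at every node.

    edges   : list of directed links (u, v)
    sources : list of source nodes; source i injects unit vector e_i
    order   : all nodes listed in a topological order of the graph
    Every node forwards the XOR of its inputs on each outgoing link.
    """
    vec = {node: [0] * len(sources) for node in order}
    for i, s in enumerate(sources):
        vec[s][i] = 1
    for u in order:
        for a, b in edges:
            if a == u:
                vec[b] = [x ^ y for x, y in zip(vec[b], vec[u])]
    return vec


if __name__ == "__main__":
    # Topology A: both sources meet at a coding point C before the receivers.
    topo_a = [("S1", "C"), ("S2", "C"), ("C", "R1"), ("C", "R2")]
    # Topology B: S1 reaches both receivers directly, S2 reaches only R2.
    topo_b = [("S1", "R1"), ("S1", "R2"), ("S2", "R2")]
    print(coding_vectors(topo_a, ["S1", "S2"], ["S1", "S2", "C", "R1", "R2"]))
    print(coding_vectors(topo_b, ["S1", "S2"], ["S1", "S2", "R1", "R2"]))
```

The receivers in the two topologies see different combinations of the source probes, which is the kind of topology-dependent signature that the inference methods in this part exploit.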

List  of  publications: 

[THESIS] P. Sattari, "Network Coding for Network Tomography," Ph.D. Thesis,
University of California, Irvine, May 2012. (The main contributions [NC1, NC2, NC5,
NC6, NC8] are summarized below.)

[NC1] P. Sattari, A. Markopoulou, C. Fragouli, M. Gjoka, "A Network Coding
Approach to Loss Tomography," in IEEE Transactions on Information Theory, Vol. 59,
Issue 3, pp. 1532-1562, March 2013.

[NC2]  P.  Sattari,  C.  Fragouli,  A.  Markopoulou,  "Active  Topology  Inference  using 
Network  Coding,"  in  the  Elsevier  Physical  Communication,  Special  Issue  on  Network 
Coding  and  its  Applications  to  Wireless  Communications,  Vol.  6,  pp.  142  -  163,  March 
2013. 

[NC3] Chun Meng, Hulya Seferoglu, A. Markopoulou, Kenneth W. Shum, Chung Chan,
"MPC: Multicast Packing for Coding across Multiple Unicasts," in Proc. of NetCod 2013,
June 2013. Technical report.

[NC4]  A.  Le,  A.  Tehrani,  A.  Dimakis,  and  A.  Markopoulou,  "Instantly  Decodable 
Network Codes for Real-Time Applications," in Proc. of NetCod 2013, June 2013.

[NC5] P. Sattari, Maciej Kurant, Animashree Anandkumar, A. Markopoulou, Michael
Rabbat, "Active Learning of Multiple Source Multiple Destination Topologies," in Proc. of
CISS 2013, March 2013.

[NC6]  P.  Sattari,  A.  Markopoulou,  C.  Fragouli,  "Maximum  Likelihood  Estimation  for 
Multiple-Source Loss Tomography with Network Coding," in Proc. of NetCod 2011, pp.
5-11, Beijing, China, July 25-27, 2011.

[NC7] A. Le, A. Markopoulou, "TESLA-Based Defense Against Pollution Attacks in P2P
Systems with Network Coding," in Proc. of NetCod 2011, Beijing, China, July 2011.

[NC8]  P.  Sattari,  A.  Markopoulou,  "Algebraic  Traceback  Meets  Network  Coding,"  in  Proc.  of 
NetCod  2011  (Poster  Session),  pp.  253-259,  Beijing,  China,  July  2011. 

[OSN1] M. Gjoka, M. Kurant, A. Markopoulou, "2.5K-Graphs: from Sampling to
Generation," in Proc. of IEEE INFOCOM 2013, Turin, Italy, March 2013.

[OSN2] F. Malandrino, M. Kurant, A. Markopoulou, C. Westphal, U. Kozat, "Proactive
Seeding for Information Cascades in Cellular Networks," in Proc. of IEEE INFOCOM
2012, Orlando, FL, March 2012.


Summary  of  Technical  Results 

Network  tomography  aims  at  inferring  internal  network  characteristics,  such  as  topology  and/or 
link-level  characteristics  (such  as  loss  rate  or  delay),  based  on  measurements  at  the  edge  of  the 
network.  There  is  a  significant  body  of  prior  work  dedicated  to  this  problem  using  multicast 
and/or  unicast  end-to-end  probes.  Independently,  recent  advances  in  network  coding  have  shown 
that there are several advantages to allowing intermediate nodes to process and combine, in
addition  to  just  forward,  packets. 

In  this  part  of  the  project,  we  revisit  the  problem  of  network  tomography  (which  allows  us  to 
send  probes  between  sources  and  receivers  at  the  edge  of  the  network),  with  network  coding 
(which  allows  intermediate  nodes  to  perform  simple  coding  operations  on  incoming  packets). 
We  showed  that  network  coding  offers  several  benefits  in  terms  of  complexity,  accuracy,  and 
bandwidth  savings.  Our  key  intuition  is  that  network  coding  at  intermediate  nodes  introduces 
topology-dependent  correlation  in  the  content  of  coded  packets,  which  can  then  be  exploited  for 
inferring  the  coding  points.  We  made  the  following  contributions  in  this  area: 

•  First,  we  revisited  multiple-source  loss  tomography  in  tree  topologies  with  multicast  and 
network coding capabilities, and we provided, for the first time, a low-complexity
Maximum Likelihood Estimator (MLE) for the link loss rates [NC1, NC6]. In addition to
the  MLE,  we  also  applied  and  evaluated  message-passing  algorithms  for  link  loss 
estimation,  both  in  trees  and  in  general  topologies. 

•  Second,  we  studied  the  topology  inference  problem  in  multiple-source  multiple-receiver 
(M-by-N)  networks  [NC2].  We  built  on  prior  work  by  one  of  our  collaborators  (M. 
Rabbat),  which  infers  a  general  M-by-N  topology  by  first  inferring  several  2-by-2 
subnetwork  components,  and  then  merging  them  to  obtain  the  M-by-N  topology.  We 
showed  that,  with  simple  network  coding  operations  at  intermediate  nodes,  it  is  possible 
to  perfectly  identify  every  2-by-2  component,  which  is  not  possible  using  only  multicast 
or  unicast  probes.  Furthermore,  we  proposed  a  new  algorithm  for  merging  all  2-by-2 
components  to  obtain  the  M-by-N  topology.  We  cast  the  problem  as  multiple  hypotheses 
testing  (in  particular,  generalized  binary  search)  [NC5]  and  we  designed  and  analyzed  a 
greedy  algorithm  that  adaptively  selects  which  2-by-2  components  to  measure  so  as  to 
minimize  the  number  of  measurements  needed  to  infer  the  M-by-N  topology. 

•  Third,  we  revisited  the  traceback  problem,  which  arises  in  the  context  of  denial-of-service 
attacks,  where  multiple  attack  sources  flood  a  victim  destination  by  sending  a  large 
number of packets. The goal of traceback is to identify the paths traversed by these
malicious packets all the way back to the attack sources, by allowing intermediate nodes
to mark a dedicated field in the headers of packets passing through them with the node id.
We  incorporated,  for  the  first  time,  network  coding  in  two  different  types  of  traceback 
schemes:  probabilistic  packet  marking  schemes  and  algebraic  traceback  [NC8].  In 
probabilistic  packet  marking,  routers  probabilistically  mark  packets  with  (a  function  of) 
their  router  id.  We  demonstrated  the  benefit  of  network  coding,  by  essentially  reducing 
the  traceback  problem  to  a  coupon  collector's  problem.  In  contrast,  algebraic  traceback 
encodes  the  ids  of  routers  on  a  single  path  as  coefficients  in  a  polynomial  of  a  single 
variable.  We  extended  that  idea  to  encode  multiple  paths  into  a  multivariate  polynomial 
and we established an interesting mapping between multi-path algebraic traceback and a
particular  network  coding  problem. 
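
The single-path algebraic traceback idea mentioned in the last item can be sketched as follows. Each packet carries a value x and an accumulator that every router updates by Horner's rule, so that the victim receives the evaluation at x of a polynomial whose coefficients are the router ids; enough packets with distinct x values allow the coefficients to be recovered by solving a Vandermonde system. The Python sketch below works modulo an assumed prime and assumes the path length is known; the prime, the function names, and the router ids are hypothetical, and the multi-path extension developed in the project is not shown. It relies on `pow(a, -1, p)`, available in Python 3.8 and later.

```python
P = 65537  # assumed prime modulus; router ids must be smaller than P


def mark_along_path(router_ids, x):
    """Horner accumulation performed hop by hop along one path."""
    acc = 0
    for rid in router_ids:
        acc = (acc * x + rid) % P
    return acc


def recover_path(samples, path_len):
    """Recover the router ids from (x, value) pairs with distinct x.

    Solves the Vandermonde system modulo P by Gauss-Jordan elimination.
    """
    n = path_len
    rows = [[pow(x, n - 1 - k, P) for k in range(n)] + [y]
            for x, y in samples[:n]]
    for col in range(n):
        pivot = next(r for r in range(col, n) if rows[r][col] != 0)
        rows[col], rows[pivot] = rows[pivot], rows[col]
        inv = pow(rows[col][col], -1, P)
        rows[col] = [(v * inv) % P for v in rows[col]]
        for r in range(n):
            if r != col and rows[r][col]:
                f = rows[r][col]
                rows[r] = [(a - f * b) % P for a, b in zip(rows[r], rows[col])]
    return [rows[k][n] for k in range(n)]


if __name__ == "__main__":
    path = [17, 4242, 9]  # ids of the routers, farthest from the victim first
    packets = [(x, mark_along_path(path, x)) for x in (2, 3, 5)]
    print(recover_path(packets, len(path)))
```

A path of n routers needs n packets with distinct x values in this sketch, since the polynomial has n unknown coefficients.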

The aforementioned study of inference in network-coded networks led to a deeper understanding
of  the  effect  of  topology  in  other  network  coding  problems,  such  as  security  and  pollution  attacks 
[NC7],  instantly  decodable  network  coding  [NC4],  and  constructive  inter-session  network  coding 
schemes  [NC3].  Furthermore,  it  provided  valuable  insight  into  a  different  thread  of  research  in 
our group, which investigated the topology of online social networks with target applications
in simulation [OSN1] and information diffusion [OSN2].

PART III: LEARNING GRAPH-BASED MODELS


Overview:  This  part  of  the  project  deals  with  graphs  beyond  communications  networks  and 
seeks to learn network structure using adaptive techniques. Today we are facing a "data deluge"
in almost every domain. The collected data in many domains are noisy and subsampled, with
typically a large number of variables or "unknowns" compared to the number of observations or
the "knowns". Such high dimensionality calls for practical, principled approaches for learning
from  ill-posed  and  ill-behaved  data.  As  part  of  this  project,  we  tackled  high  dimensional  learning 
by  exploiting  inherent  data  structure,  either  in  the  form  of  structural  relationships  among  the 
variables,  represented  as  graphs  or  as  parametric  forms,  represented  as  tensor  decompositions. 
Below  we  summarize  the  publications  and  some  key  results  in  learning  of  graph-based  models. 

List  of  Publications: 


A.  Anandkumar,  A.  Hassidim,  and  J.  Kelner.  Topology  Discovery  of  Sparse  Random  Graphs 
With  Few  Participants.  In  Proc.  of  ACM  SIGMETRICS,  June  2011.  Winner  of  Best  Paper 
Award. 

A.  Anandkumar,  K.  Chaudhuri,  D.  Hsu,  S.M.  Kakade,  L.  Song,  and  T.  Zhang.  Spectral  Methods 
for Learning Multivariate Latent Tree Structure. In Proc. of Neural Information Processing
(NIPS), Dec. 2011.

A.  Anandkumar,  V.  Y.  F.  Tan,  and  A.  S.  Willsky.  High-Dimensional  Graphical  Model  Selection: 
Tractable  Graph  Families  and  Necessary  Conditions.  In  Proc.  of  Neural  Information  Processing 
(NIPS),  Dec.  2011. 

A.  Anandkumar,  V.Y.F  Tan,  F.  Huang,  and  A.S.  Willsky.  “High-Dimensional  Structure 
Learning  of  Ising  Models:  Local  Separation  Criterion”.  Annals  of  Statistics,  Volume  40,  Number 
3  (2012),  1346-1375. 

A. Anandkumar, V.Y.F Tan, F. Huang, and A.S. Willsky. “High-Dimensional Gaussian Graphical
Model Selection: Walk-Summability and Local Separation Criterion”. J. Machine Learning
Research, 13:2293-2337, Aug. 2012.

A.  Anandkumar,  D.  Hsu,  and  S.M.  Kakade.  A  Method  of  Moments  for  Mixture  Models  and 
Hidden  Markov  Models.  In  Proc.  of  Conf.  on  Learning  Theory,  June  2012. 

M.  Janzamin  and  A.  Anandkumar.  High-Dimensional  Covariance  Decomposition  into  Sparse 
Markov  and  Independence  Domains.  In  Proc.  of  International  Conf.  on  Machine  Learning,  June 
2012. 

A.  Anandkumar,  D.  Hsu,  F.  Huang,  and  S.M.  Kakade.  Learning  Mixtures  of  Tree  Graphical 
Models.  In  Proc.  of  Neural  Information  Processing  (NIPS),  Dec.  2012. 

A. Anandkumar and R. Valluvan. Learning Loopy Graphical Models with Latent Variables:
Efficient  Methods  and  Guarantees.  In  Proc.  of  Neural  Information  Processing  (NIPS),  Dec.  2012. 

Summary of Technical Results:

Probabilistic Graphical Models: One graphical framework for representing high-dimensional
data is that of probabilistic graphical models, also known as Markov random fields or Markov
networks. A Markov network represents complex relationships between data at different nodes in
the form of a graph, known as the dependency graph. Mathematically, any two sets of nodes A
and B that are separated in the graph by a set S are conditionally independent given the
separator set S: $X_A \perp X_B \mid X_S$. Hence, the data at each node is directly influenced
only by its neighbors in the dependency graph. A Markov representation is succinct, with a much
smaller number of parameters than the number of data dimensions (variables), and it explicitly
encodes the relationships between the variables.
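
As a small numerical illustration of the separation property, the sketch below assumes a zero-mean Gaussian Markov network on a hypothetical three-node chain 0 - 1 - 2. The matrix `J` and the helper `conditional_cov` are illustrative choices, not taken from the project. In the Gaussian case, zeros in the precision (inverse covariance) matrix correspond to missing edges, so nodes 0 and 2 are correlated marginally but become uncorrelated once the separator, node 1, is conditioned on.

```python
import numpy as np

# Hypothetical 3-node chain 0 - 1 - 2. The precision matrix J has a zero
# in position (0, 2) because the dependency graph has no edge between 0 and 2.
J = np.array([[ 2.0, -1.0,  0.0],
              [-1.0,  2.0, -1.0],
              [ 0.0, -1.0,  2.0]])
Sigma = np.linalg.inv(J)  # the covariance matrix itself has no zeros


def conditional_cov(Sigma, i, j, S):
    """Cov(X_i, X_j | X_S) for a zero-mean Gaussian with covariance Sigma."""
    S = list(S)
    if not S:
        return Sigma[i, j]
    S_SS = Sigma[np.ix_(S, S)]
    return Sigma[i, j] - Sigma[i, S] @ np.linalg.solve(S_SS, Sigma[S, j])


marginal = Sigma[0, 2]                          # nonzero: 0 and 2 are dependent
conditional = conditional_cov(Sigma, 0, 2, [1])  # zero up to rounding
print(f"marginal cov = {marginal:.3f}, cov given node 1 = {conditional:.3e}")
```

The same check with an empty conditioning set returns the marginal covariance, which is why separation by S, rather than mere non-adjacency, is what the graph encodes.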

Formulation of Learning from Data: Given n i.i.d. data samples
$x^n := [x(1), x(2), \ldots, x(n)]^T$ from a graphical model P with Markov graph G, the goal is
to estimate the underlying graph G. We developed methods and provided consistency guarantees
for graph estimation in the high-dimensional regime.
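
To make the estimation task concrete, here is a minimal sketch of one way to estimate a graph from samples. It is in the spirit of the conditional covariance threshold tests associated with the local separation criterion in the Gaussian model selection paper listed above, but it is not the authors' implementation. The function names, the bound `eta` on the separator size, the threshold value, and the synthetic four-node chain are all assumptions made for illustration. An edge (i, j) is declared only if the empirical conditional covariance of the pair stays above the threshold for every conditioning set of size at most `eta`.

```python
import itertools
import numpy as np


def conditional_cov(Sigma, i, j, S):
    """Cov(X_i, X_j | X_S) under a Gaussian model with covariance Sigma."""
    S = list(S)
    if not S:
        return Sigma[i, j]
    return Sigma[i, j] - Sigma[i, S] @ np.linalg.solve(Sigma[np.ix_(S, S)], Sigma[S, j])


def estimate_graph(X, eta=1, threshold=0.1):
    """Keep edge (i, j) if |Cov(i, j | S)| > threshold for all |S| <= eta."""
    p = X.shape[1]
    Sigma_hat = np.cov(X, rowvar=False)  # empirical covariance, columns are variables
    edges = set()
    for i, j in itertools.combinations(range(p), 2):
        others = [k for k in range(p) if k not in (i, j)]
        smallest = min(
            abs(conditional_cov(Sigma_hat, i, j, S))
            for size in range(eta + 1)
            for S in itertools.combinations(others, size)
        )
        if smallest > threshold:
            edges.add((i, j))
    return edges


# Illustrative data: samples from a Gaussian model on the chain 0 - 1 - 2 - 3.
J = 2 * np.eye(4) - np.eye(4, k=1) - np.eye(4, k=-1)
rng = np.random.default_rng(0)
X = rng.multivariate_normal(np.zeros(4), np.linalg.inv(J), size=20000)
edges = estimate_graph(X)
print(sorted(edges))
```

The search over conditioning sets costs on the order of p^(eta+2) conditional covariance evaluations, so a sketch like this is only practical when small separators suffice.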

Structure Learning with Hidden Variables: Developing tractable methods to discover hidden
nodes and the overall graph structure (and parameters) was an important goal of this project.
Co-PI Anandkumar developed efficient methods for learning latent variable models in a
variety of settings, including novel methods for learning hidden tree models. The developed
algorithms have low sample complexity and are much faster and more robust than the state of
the art. At a high level, the algorithm maintains a tree model in each iteration and adds
hidden variables by conducting local tests. This property is unique to our approach and makes
it well suited to real data, since we can trade off model complexity against data fit in a
principled and efficient manner. We extended these methods to learning latent loopy models
with long cycles [NIPS’2012], and demonstrated their effectiveness in financial and topic
modeling.

Bayesian Networks with Latent Variables: In addition to incorporating latent variables, it is
important to model the complex dependencies among the variables. In [ICML’13], we provided
novel methods for learning directed acyclic graphs (DAGs) with hidden variables. The method is
based on the intuition that learning is tractable when there is sufficient expansion in the DAG
from hidden to observed variables (e.g., when it is a latent tree or has a small number of
colliders, i.e., nodes with multiple parents). This work combines sparse dictionary learning
with the method of moments in a novel manner and is the first work to provide guaranteed
learning for latent Bayesian networks. This has implications in many practical settings, e.g.,
for learning correlated topic models.

Modeling Using Multiple Graphs: Modeling high-dimensional data involves a delicate trade-off
between faithful representation and parsimony. Models that are sparse in some domain achieve a
parsimonious representation but may fit the given data poorly. We have developed frameworks for
relaxing the sparsity constraints without sacrificing parsimony in high dimensions. One
framework involves incorporating hidden factors, which can change the structural (and
parametric) relationships among the observed variables [NIPS’13], thereby resulting in a mixture
of probabilistic graphical models. We developed methods with guaranteed recovery of the mixture
components that are also efficient for practical implementation. We also considered another
approach for modeling with multiple graphs. In [ICML’12], the observed data is fitted to a
combination of a sparse graphical model and a sparse independence model, thereby
incorporating different kinds of statistical relationships among the variables. We developed novel
decomposition methods based on convex relaxation with guaranteed recovery in both
domains.
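
The combined model of [ICML’12] can be pictured with a small numerical sketch. It assumes, for illustration only, an additive form in which the observed covariance is the sum of a Markov part with a sparse inverse and a sparse residual (independence) part. The sign convention, the matrices `J_M` and `Sigma_R`, and the four-variable size are assumptions, and the sketch does not implement the convex relaxation that estimates the two parts from data. It only shows why neither domain alone is sparse: the observed covariance has no zeros, yet removing the sparse residual exposes a sparse inverse.

```python
import numpy as np

# Markov part: sparse precision matrix of a hypothetical chain 0 - 1 - 2 - 3.
J_M = 2 * np.eye(4) - np.eye(4, k=1) - np.eye(4, k=-1)

# Independence part: sparse residual covariance with one off-diagonal pair.
Sigma_R = 0.5 * np.eye(4)
Sigma_R[0, 3] = Sigma_R[3, 0] = 0.3

# Assumed additive combination of the two models.
Sigma = np.linalg.inv(J_M) + Sigma_R

# The observed covariance is fully dense ...
sigma_is_dense = bool(np.all(np.abs(Sigma) > 1e-8))

# ... but once the residual is removed, the inverse is the sparse chain again.
recovered_J = np.linalg.inv(Sigma - Sigma_R)
markov_part_is_sparse = bool(np.allclose(recovered_J, J_M))
print(sigma_is_dense, markov_part_is_sparse)
```

In practice the residual is unknown, which is where a decomposition method with recovery guarantees in both domains is needed.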

Finally, we applied the algorithms developed above to a number of practical problems, including
financial and document modeling, object recognition in computer vision, tracking the evolution
of dynamic social networks [Sunbelt 2012], and modeling gene associations. In all of these
instances, the methods showed substantial improvement over previous approaches.